Copyright © 1978 American Telephone and Telegraph Company
The Bell System Technical Journal
Vol. 57, No. 4, April 1978
Printed in U.S.A.
Loop Plant Modeling:
Statistical Analyses of Costs in Loop Plant
Operations
By D. M. DUNN and J. M. LANDWEHR
(Manuscript received October 10, 1977)
The Serving Area Concept (SAC) involves a new procedure for the
design and administration of the loop plant to reduce operating costs.
Two major problems facing a loop plant engineer considering conver-
sion to SAC are determining which areas should be converted (and in
what order) and assessing the savings resulting from the conversion.
This paper presents methodology and data analysis results useful for
solving such problems. The data analyzed are from the Prototype
District and measure a large number of facility related problems both
before and after conversion to SAC. A cost penalty measure, based on
observed facility problems, is calculated for a given area using data
collected in that area over a certain period of time. The before con-
version data are characterized and modeled in order to quantify the
uncertainty, in the form of a confidence interval, associated with this
cost penalty. Confidence intervals are useful to decide appropriate sizes
for the data collection areas, appropriate lengths of time for data col-
lection, as well as for comparing the results between two or more areas.
The effect of conversion to SAC on the cost penalty measure is also ex-
amined. It is found that after conversion costs are much lower than
before conversion costs, but that costs continue to decrease for at least
9 to 12 months after conversion takes place. The analysis and results
presented here yield methods and guidelines to be used for data col-
lection and analysis in other districts. These can help in reliably
choosing areas for conversion to SAC which will maximize savings.
I. INTRODUCTION AND SUMMARY
Investment decisions in the loop plant, like most such investment
decisions in the Bell System, are dependent on careful analyses and the
data which underlie these analyses. This paper describes detailed studies
of a large body of data measuring several kinds of loop plant operations
and costs. The cost measures used are based on the Facility Analysis
Plan for Outside Plant (FAP); this plan, described and discussed in Ref.
1, gives methods for managing the loop plant. The results of this paper
contain guidelines for the use of certain FAP measures, as well as insights
into related characteristics of the data.
The data analyzed here are from the Prototype District Project, 2 a
major effort undertaken to analyze those operating costs of a district that
can be controlled by changes in the design or administration of the loop
network. This involved a nearly three year study of the Passaic District
of New Jersey Bell Telephone Company. Passaic is an urban area with
some small business, scattered apartments, and large old houses. Many
sections were converting from single- to multiple-family dwellings. Much
of the existing loop plant was congested and had maintenance problems.
Thus, conversion to the Serving Area Concept (SAC) 3 was considered
appropriate for much of the district. This conversion involves departures
from dedicated plant design and multipled plant design. 4 Serving area
interfaces, which are basically large boxes containing cable pair inter-
connect points, are installed in appropriate places in the network. Then
cable pairs are permanently connected from the interface to the cus-
tomer, and complements of feeder pairs from the central office to the
interface are supplied as needed. The Facility Analysis Plan, developed
from the Prototype District Project, gives methods for determining when
and where conversion to SAC is appropriate.
The Prototype District Data Base 5 is the key to tracking district ac-
tivities. Each month over 50,000 measurements of district operations
involving facility related problems were recorded. (Many of these mea-
surements were zero.) Data were retained by 50-pair complement by
month for that part of the district undergoing extensive conversion to
serving areas. Data are available from April 1973 through December
1975.
There are many procedures in the Facility Analysis Plan to aid in
understanding costs and potential savings in the management of loop
plant. Among the concepts involved are allocation areas, 1,4 which are
geographical regions used for tracking operating costs and cable usage.
Allocation areas are also basic units of plant for planning additions or
changes in the network such as conversion to SAC. Therefore, in order
to trigger the need for treatment of the network these areas are initially
ranked on the basis of facility problems in each area. This ranking is
based on a weighted linear combination of facility problems normalized
by the number of assigned pairs in the area. The weights are costs asso-
ciated with the individual problem items and together yield a "Cost
Penalty Per Assigned Pair" (CPPAP). In Ref. 1, the Normalized Yearly
Marginal Operating Cost, which is a generalization of CPPAP, is used as
a basis for their discussion. Other cost calculations include the "Plant
Stabilization Analysis Form" and the "CUCRIT" analysis to compute the
rate of return associated with a given investment strategy. While these
other cost calculations are important and relevant to FAP, the focus of
this paper is on the CPPAP calculation and its component parts.
Three specific reasons motivate the choice of CPPAP for analysis here.
First, it is the initial form used to analyze data in FAP and as such holds
an important position. Second, the cost calculations for CPPAP are linear
combinations of observed quantities and hence directly interpretable.
Third, CPPAP does not require any special factors (e.g., "improvement
factor") as are needed in most of the other measures.
The general purpose of this paper is to give insight into facets of these
data relating to the conversion of selected allocation areas to SAC which
took place during the Prototype District Project. Two important prob-
lems to the loop plant engineer are to determine which of the allocation
areas should be converted (and in what order), and to assess the savings
resulting from the conversion. The data analysis addresses these prob-
lems by modeling the variability of the FAP data. The uncertainty as-
sociated with projected savings is found to decrease as the serving areas
become larger (in assigned pairs) and the data collection period in-
creases.
An exploratory analysis of the before, during, and after conversion
cost measure and its components (Section II) shows that the cost mea-
sure varies widely both across areas and time. Assignment changes, cable
troubles, and defective pairs contribute the most to the level and vari-
ability. A detailed statistical analysis of the before conversion cost data
in Section III is used as a basis to develop confidence intervals (Section
IV) on the "true" cost penalty. These intervals quantify the uncertainty
associated with an observed cost penalty for a given area. They are useful
to decide appropriate sizes for the data collection areas, appropriate
lengths of time for data collection, as well as for comparing the results
between two or more areas. Moreover, confidence intervals show the
trade-off between the size of the data collection area and the data col-
lection period.
Finally, the effect of the conversion on the cost measure is examined
in Section V. A regression equation is developed which models the after
conversion costs in terms of before and during conversion variables as
well as the time since conversion. The major result shows that costs
continue to decrease after conversion takes place. In order to get an
adequate measure of the savings associated with conversion to SAC, one
must collect data for at least nine to twelve months after conversion.
It should be noted (before proceeding with the data analysis) that
much of the work described was also performed on other savings mea-
sures including the rate of return. The same techniques which are shown
for CPPAP were found useful, but for brevity their results are not
shown.
II. GENERAL CHARACTERISTICS OF THE COST DATA
2.1. Introduction
The purpose of this section is to give some insight into the data used
in the further analyses in this paper. As described above, the analysis
focuses solely on the data in the CPPAP, which is calculated using the
"Allocation Area Problem Ranking Worksheet." 1 This worksheet is
shown in Fig. 1. Column B, the cost factors, are specific to the Prototype
District, but they are also representative of other loop plant districts.
Abbreviations used in Fig. 1 and throughout this section are as follows:
LST — line and station transfer; WOL — wired out of limits; BCT — break
connect-through; CDP — clear defective pair; BPC — break permanent
connection; CIR — control point interconnection; RE — referred to engi-
neer; RTC — reterminated connection; AC-SOD — assignment change
because the originally assigned pair from a service order was found to
be defective; AC-NS — non-service-order assignment change; AC-OTH —
other assignment change; FCT-7AB — 7A or 7B cable trouble associated
respectively with splicing and terminating troubles; FCT-OTH — other
cable trouble; DEF PRS — defective pairs. For definitions and discussion
of these and other loop plant terms, see Ref. 4.
Two of the items on the worksheet were not measured directly in the
data base. They are the BCT and RTC. However, based on engineering
studies in the Prototype District 6 it was determined that these could be
adequately approximated for the Prototype District during the study
period by a fraction of the total facilities assigned, which is measured
in the data base. These studies determined that BCTs were 13 percent
of the facilities assigned and that RTC were 35 percent of facilities as-
signed. Finally, the management of the loop plant used in the Prototype
District was such that there were no CDP, BPC, or CIR. Therefore, in all
further analyses these cost factors are ignored. All other variables, except
the number of defective pairs, are available (monthly) in the data base.
Defective pairs were entered annually from the district's yearly pair
status report. This report gives the pair status (e.g., assigned, defective,
etc.) as of January 1 and is used monthly for the twelve month period
centered at January 1 (i.e., July through June). Thus, the data to be
studied in this section are the monthly values of the CPPAP and the 11
sub-components of CPPAP that were either measured or estimated during
the study.
Form header fields: AA; DATE OF RANKING; DIST CO KFT

Line  Item                 (A) Entry    (B) Cost factor   (C) Cost penalty
  1   LST                  #/yr         x  17.52          =
  2   WOL                  #/yr         x  36.81          =
  3   BCT                  #/yr         x   7.64          =
  4   CDP                  #/yr         x  72.70          =
  5   BPC                  #/yr         x  24.64          =
  6   CIR                  #/yr         x  70.55          =
  7   RE                   #/yr         x  35.15          =
  8   RTC                  #/yr         x   9.48          =
  9   AC - S.O. Def        #/yr         x  29.35          =
 10   AC - Non S.O. Def    #/yr         x  68.14          =
 11   AC - Other           #/yr         x  32.63          =
 12   FCT - 7A, B          #/yr         x  83.32          =
 13   FCT - Other          #/yr         x 109.00          =
 14   Def Prs              # def prs    x   0.91          =
 15   TOTAL COST PENALTY (SUM 1 TO 14)
 16   COST PENALTY PER ASSIGNED PAIR = LINE 15 / # ASSIGNED PAIRS
Fig. 1 — Allocation area problem ranking worksheet.
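The worksheet arithmetic can be illustrated with a short Python sketch: yearly counts are multiplied by their cost factors, summed, and divided by the number of assigned pairs. The pairing of cost factors with items follows the order in which they appear on the worksheet and should be treated as an assumption. The function name and the example counts are hypothetical and are invented purely for illustration. BCT and RTC are approximated as 13 and 35 percent of facilities assigned, as described in Section 2.1.

```python
# Cost factors in dollars, paired with worksheet items in listed order (assumed pairing).
COST_FACTORS = {
    "LST": 17.52, "WOL": 36.81, "BCT": 7.64, "CDP": 72.70, "BPC": 24.64,
    "CIR": 70.55, "RE": 35.15, "RTC": 9.48, "AC-SOD": 29.35, "AC-NS": 68.14,
    "AC-OTH": 32.63, "FCT-7AB": 83.32, "FCT-OTH": 109.00, "DEF PRS": 0.91,
}


def cppap(counts, facilities_assigned, assigned_pairs):
    """Cost penalty per assigned pair from yearly counts (hypothetical helper)."""
    entries = dict(counts)
    # BCT and RTC were not measured directly; approximate them from facilities assigned.
    entries.setdefault("BCT", 0.13 * facilities_assigned)
    entries.setdefault("RTC", 0.35 * facilities_assigned)
    total = sum(COST_FACTORS[item] * n for item, n in entries.items())
    return total / assigned_pairs


# Invented example: one LST and two AC-NS tickets per year, 100 facilities assigned.
example = cppap({"LST": 1, "AC-NS": 2}, facilities_assigned=100, assigned_pairs=500)
```

Items with no recorded occurrences can simply be left out of `counts`, which mirrors the treatment of CDP, BPC, and CIR in the Prototype District.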
2.2. Components of CPPAP
The CPPAP has 11 non-zero cost components. However, two of those
variables are perfectly correlated since they are both proportions of the
facilities assigned (i.e., BCT and RTC). Therefore, since both the cost
factor (see Fig. 1) and the proportion of facilities assigned are higher
for RTC than for BCT, it is the RTCs which will be used in
the further analyses in this subsection. In later sections of the paper all
components are used in the calculation of CPPAP.
A numerical summary of the level (mean) and variability (standard
deviation) of the ten cost components for each of the three stages of area
conversion is given by Table I. So that a few extreme data values do not
overwhelm the rest of the data, the 25 percent trimmed mean and
standard deviation were used. Thus these values are based on only the

Table I - CPPAP component costs for 10 converted areas

                  Trimmed mean               Trimmed std dev
Variable     Before  During  After      Before  During  After
LST           0.65    0.12    0.0        0.77    0.17    0.0
WOL           0.04    0.0     0.0        0.18    0.0     0.0
RE            0.77    0.04    0.0        1.20    0.19    0.0
RTC           0.53    0.40    0.21       0.22    0.14    0.09
AC-SOD        0.26    0.10    0.01       0.38    0.18    0.12
AC-NS         1.44    3.78    0.46       1.16    3.35    0.52
AC-OTH        1.12    1.47    0.44       0.56    1.41    0.41
FCT-7AB       2.30    4.97    0.21       1.36    4.50    0.61
FCT-OTH       0.19    0.13    0.0        0.82    0.39    0.0
DEF PRS       0.80    0.84    0.85       0.63    0.83    0.89

middle 50 percent of the data. First the trimmed mean across months
for each area in each stage of conversion was computed; the tabled values
are the trimmed mean and trimmed standard deviation of those values
across the 10 converted areas. Focusing on the mean (level) values first,
it is clear that the dollar costs shown in the table vary widely from
component to component as well as for the stages of conversion. Perhaps
the most remarkable change is in the non-service-order assignment
change tickets (AC-NS) which go from $1.44 before to $3.78 during to
$0.46 after. However, considering the physical situation, this type of
behavior is to be expected. During the conversion, many of the cable pairs
are being handled by the nature of the design of an allocation area. This
can cause many of the pairs to become defective and can cause an in-
terruption in the customer's service. The service is restored either by
changing the customer to a new pair (recorded as an AC-NS) or actually
fixing the defective pair (recorded as an FCT-7AB). Note further that the
occurrences of splicing and terminating cable troubles (FCT-7AB) also
peak during conversion and fall to greatly reduced levels in the after
period. However other cable troubles (FCT-OTH) contribute little to
CPPAP. The category of assignment changes due to the originally as-
signed pair from a service order being defective (AC-SOD) drops to very
nearly zero after conversion. Other assignment changes (AC-OTH) is a
major contributor to CPPAP during all three periods of conversion. The
LST, WOL, and RE after conversion all have zero trimmed mean and
standard deviation. The category of defective pairs (DEF PRS) is inter-
esting because its level stays the same from during to after, and its
variability actually increases during this transition. However, since the
defective pair data is only updated annually, these results should be
considered preliminary. More detailed special studies of defective pair
rates have been performed and are included in Ref. 2.
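The 25 percent trimming used for Table I can be sketched as follows. This is a minimal illustration, assuming that the trimmed standard deviation is simply the sample standard deviation of the middle 50 percent of the values; the paper does not state whether any rescaling was applied, and the function name and data are hypothetical.

```python
import numpy as np


def trimmed_mean_std(values, trim=0.25):
    """Mean and standard deviation of the values left after trimming each tail."""
    x = np.sort(np.asarray(values, dtype=float))
    k = int(np.floor(trim * len(x)))      # number of values dropped from each end
    core = x[k:len(x) - k]                # the middle portion of the data
    std = core.std(ddof=1) if len(core) > 1 else 0.0
    return core.mean(), std


# Invented monthly costs with one extreme value that trimming discards.
mean, std = trimmed_mean_std([0.0, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 40.0])
```

With eight values, two are dropped from each end, so the extreme value of 40.0 has no influence on either summary.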
While the table is a helpful summary of overall behavior, it is not useful
in trying to characterize the similarity and differences among the areas
with regard to the components of CPPAP. Graphical displays of multi-
variate data are often useful for gaining insight into the basic structure
of data. However, they tend to become more complicated and less useful
as the number of variables increases. Based on Table I, it seems fairly
clear that most of the interesting (large and variable) dollar components
of CPPAP are found in the assignment changes, the cable troubles, and
the defective pairs. The costs associated with LSTs, WOLs, REs and RTCs
tend to be both small and fairly stable. Therefore, in the graphical dis-
plays the focus will be on the six largest and most diverse cost compo-
nents.
Figure 2 gives one example of a polygon plot 7 for three of the converted
areas and the mean converted area (i.e., the 25 percent trimmed mean
of the converted areas). The polygon is formed by connecting the value
of each variable plotted on its respective axis (see Fig. 2 key). By exam-
ining the polygons associated with different areas and stages of con-
version it is possible to visually compare and contrast characteristics of
the areas. Note the similarity of the areas for before, during, and to some
extent after. The values in these plots are as in Table I, and show dollar
amounts. The scaling is designed to show most of the variability in these
data without being distorted by a few very large values. Although areas
of a polygon do not directly correspond to the total cost associated with
an allocation area, areas do give some idea of that sum. For example, it
is clear that after conversion the cost penalty is very small compared with
during and before. The anomalous large value of the non-service order
assignment changes (mentioned earlier) is evident in the during period.
The peak on the first axis from the vertical position is this large
value.
2.3. Analysis of CPPAP
To achieve an initial feel for the nature of the CPPAP data, a plot of
these values against time for the individual allocation areas is useful.
Figure 3 shows a sequence of four allocation areas for their entire 33
month data history. Note that the vertical scales on the four plots, which
show dollar cost penalties, are different. While such differences make
across area comparisons difficult, the range of the data (particularly
including converted and non-converted areas) is so large that using a
single scale would obscure much of the available detail. Because there
is a good deal of variability in the CPPAP measure, a non-linear (resistant)
smoother is applied to the data and plotted (as the solid line) along with
the raw values. The resistant smoother used is (3RSR), twice. 8 Since this
smoother is based on moving medians, rather abrupt changes may occur
in the smoothed output. This smoother was selected for just this reason
so that rapid changes in the level of the data (e.g., after conversion) would
not be obscured.
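The behavior of a median-based smoother can be seen in a small Python sketch. It implements only repeated running medians of three followed by the "twice" step (smoothing the residuals and adding them back); the splitting step of the full 3RSR procedure is left out for brevity, so this is a simplified stand-in rather than the smoother used for Fig. 3. Function names and data are hypothetical.

```python
import numpy as np


def median3(x):
    """One pass of running medians of 3; the two end values are copied unchanged."""
    x = np.asarray(x, dtype=float)
    out = x.copy()
    if len(x) >= 3:
        stacked = np.vstack([x[:-2], x[1:-1], x[2:]])
        out[1:-1] = np.median(stacked, axis=0)
    return out


def smooth_3r(x, max_iter=100):
    """Repeat running medians of 3 until the sequence stops changing."""
    cur = np.asarray(x, dtype=float)
    for _ in range(max_iter):
        nxt = median3(cur)
        if np.array_equal(nxt, cur):
            break
        cur = nxt
    return cur


def smooth_3r_twice(x):
    """Smooth the data, then smooth the residuals and add them back."""
    x = np.asarray(x, dtype=float)
    smooth = smooth_3r(x)
    rough = x - smooth
    return smooth + smooth_3r(rough)


# Invented series: an isolated spike at month 4 and a level shift at month 7.
smoothed = smooth_3r_twice([2, 2, 2, 50, 2, 2, 10, 10, 10, 10])
```

In this example the isolated spike is removed while the abrupt change in level is kept intact, which is the property the text relies on.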
Of these four allocation areas (212 through 215), two were eventually
converted (213 and 214), while the other two were not. For those areas
which have been converted, lines are drawn to indicate the end of the
[Figure 2(a): polygon plots for allocation areas 213, 222, and 231 and for the mean area, each shown before, during, and after conversion.]
Fig. 2— (a) Components of CPPAP (radius length 7.12). (b) Key.
before conversion period, and the beginning of the after conversion pe-
riod. Note that these vertical lines are drawn between actual monthly
observations. The data accuracy only allows full month designations of
before, during, or after. For example, in area 214 months 1-5 are before,
months 6-13 are during, and months 14-33 are after conversion.
Analysis of this figure (and others) showing all the area-time histories
gives a considerable amount of insight into the nature of the data.
(i) The CPPAP for the areas where there is no conversion tends to
be more stable than for areas that undergo conversion.
[Figure 2(b): key to the polygon axes; legible axis labels are AC-SOD, DEF PRS, AC-NS, FCT-OTH, and AC-OTH.]
Fig. 2. (continued)
(ii) Fairly large excursions from a smooth value are evident for all
areas. (Note that the resistant smoother is not affected by these unusual
excursions.)
(iii) The level and variability of the before, during, and after may be
quite different.
(iv) The after conversion behavior of these areas is quite different.
For example, in area 213 the CPPAP drops quickly to a value near zero.
In area 214 there is a slow but steady decline to a near zero value for
CPPAP.
(v) No evident seasonal pattern is visible in this limited amount of
data.
Table II shows a basic summary of the behavior of each of the 10
converted areas for before, during, and after conversion months. The
25 percent trimmed mean and standard deviation are used, as in Table
I, so that the tabled values reflect the bulk of the data. Table II shows
that both the level and variability change during the "life" of an area.
The during period tends to have the highest levels. The after is the lowest
(as would be both expected and presumed because the effect of con-
version is to reduce the occurrence of the costly plant troubles) both in
level and variability. The variability of the before conversion data is quite
high and not uniform across areas.
In summary, based on these and similar displays, CPPAP values appear
to vary quite widely both across allocation areas and stages of conversion.
For those areas which were converted, the level and variability of the
individual components of CPPAP tend to be concentrated in the as-
signment changes, cable troubles, and defective pairs.
[Figure 3: CPPAP plotted against months, one panel per allocation area; the panels for areas 212 and 213 are legible in the source.]
Fig. 3 — Rough and smooth CPPAP for various allocation areas.
III. EXPLORATORY AND GRAPHICAL ANALYSES OF BEFORE
CONVERSION DATA
3.1. Motivation
One use of the Prototype District Data Base is to develop methods of
analysis for determining which allocation areas in other districts should
be converted to SAC, using FAP techniques. Related questions concern
how many months of data should be collected before making such de-
cisions, and how large the areas should be in the first place so reliable
decisions can eventually be made. This section explores certain
properties of these data, motivated by these goals. Fluctuations in the
calculated cost penalty across months and across areas can be large, as
was seen in Section II. Thus, statistical methods are needed to help
answer these questions. Since only before conversion data could be used
to help in making decisions regarding conversion, only the before data
from the data base are considered here. The analysis uses the cost
measure CPPAP for reasons described in Section I.

Table II - CPPAP for 10 converted areas

              Trimmed mean               Trimmed std dev
Area     Before  During  After      Before  During  After
209       4.39    6.68   1.90        2.84    5.20   1.10
210       5.97   22.32   3.51        6.52   10.98   2.22
213       8.84   32.56   3.35        3.40   31.65   3.25
214      11.43   11.81   4.18       14.75    7.73   5.98
221       9.58   10.29   1.91        6.86    9.58   0.00
222       9.41   11.64   4.01        3.02    5.03   0.00
227      10.37   10.07   6.67        6.85    4.30   4.95
228      11.73   14.97   4.22        4.91    7.61   1.19
229      14.15   17.00   3.02       10.86   22.88   0.93
231      21.10   17.03   3.32        7.52    7.28   2.04

The goal here is to examine the structure of the before conversion data
so as to be led to reasonable methods of analysis (i.e., reasonable as-
sumptions and models) to answer these questions. We concentrate on
searching for and examining certain relationships by studying appro-
priate scatter plots. While certain numerical statistics are also useful for
such purposes, an advantage of plots is that they are more exploratory
in nature. Section IV then presents and uses a specific model, supported
by the data, as a way of answering the questions in the previous para-
graph.
3.2. Analyses
For the following plots, consider the cost penalty $x_{ij}$ for area $i$ and
month $j$. The mean, $\bar{x}_i$, and standard deviation, $s_i$, of these values
across months for each area were calculated. Only before conversion data were
used, so the number of months differs from one area to another; however,
recall that 13 of the 23 areas were never converted, so for these areas all
33 months are available. Figure 4 plots the standard deviation $s_i$ vs. the
average cost penalty $\bar{x}_i$ for all 23 areas. A positive relationship
between these two quantities is very clearly apparent. Such a relationship
strongly violates assumptions that would be desirable and convenient to use.
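The per-area summaries behind a plot like Fig. 4 can be computed along the following lines. This is only a sketch: the data are invented, the helper name is hypothetical, and the sample standard deviation (divisor $m_i - 1$) is assumed.

```python
import numpy as np


def area_summaries(before_cppap):
    """Map each area to (mean, std) of its monthly before-conversion CPPAP values."""
    out = {}
    for area, monthly in before_cppap.items():
        x = np.asarray(monthly, dtype=float)
        out[area] = (x.mean(), x.std(ddof=1))
    return out


# Invented monthly values; areas may have different numbers of before months.
data = {"A": [2.0, 4.0, 6.0], "B": [10.0, 20.0, 30.0, 40.0], "C": [1.0, 1.0, 4.0]}
summaries = area_summaries(data)
means = np.array([m for m, _ in summaries.values()])
stds = np.array([s for _, s in summaries.values()])
# A positive correlation here mirrors the kind of pattern described for Fig. 4.
corr = np.corrcoef(means, stds)[0, 1]
```

A strongly positive `corr` signals that the spread grows with the level, which is the condition that motivates looking for a transformation.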
Another look at this relationship can be obtained by considering the
sizes of the 23 areas. Since the cost penalty $x_{ij}$ is itself an average
calculated over the number of pairs in the area (cost penalty per assigned
pair), one might expect the standard deviation of these values, $s_i$, to be
smaller the larger the size of the area. Figure 5 plots $s_i$ vs. the number
of assigned pairs in the $i$th area, $p_i$. From theoretical grounds one might
expect the relationship between $s$ and $p$ to be of the form $s = a/\sqrt{p}$, for
[Figure 4: area standard deviation plotted against allocation area mean.]
Fig. 4 — CPPAP values before conversion.
some $a$. The points in Fig. 5 look like they might generally follow a
relationship like this, plus some scatter. Thus, we fit a curve $s = a/\sqrt{p}$ to
these points using least squares* and then formed the residuals $(s_i - \hat{s}_i)$.
Each residual is plotted against the corresponding $\bar{x}_i$ in Fig. 6. Again a
strong increasing relationship is apparent; the larger the average cost
penalty $\bar{x}_i$ for an area, the more likely it is that $(s_i - \hat{s}_i)$ is positive and
large. Even after removing the effect of area size from the standard
deviation $s_i$, higher area averages $\bar{x}_i$ are associated with higher area
standard deviations $s_i$.
One approach to answering the questions put forth in Section 3.1
would be to fit an appropriate linear statistical model to these data, and
then make inferences from that model. However, one of the assumptions
underlying the usual fitting of such a model is that of homogeneity of
variance; i.e., the variance of the observations should be constant across
different levels of other variables. Because of the relationships seen
above, it is worthwhile also to consider transformations of CPPAP when
exploring the before conversion data. Some transformed variable quite
possibly could be generally appropriate for later, more formal analysis
than would the raw CPPAP values.
* Weighted least squares were used, for reasons described below.
[Figure 5: area standard deviation plotted against assigned pairs.]
Fig. 5 — CPPAP values before conversion.
Several transformations of the cost penalties within the family
$y = (x + a)^b$, with $a$ and $b$ specified parameters, were calculated and studied.
Considering the results as a whole, the most satisfactory and interesting
properties appeared using the transformation $y = \ln(x + 1)$, which
corresponds to $b = 0$, with $x$ the CPPAP as before. Thus the following
plots in this section were all constructed using this transformation.
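The transformation family can be written as a small Python function, with $b = 0$ taken to mean the logarithm as in the text. The example data are invented and the function name is hypothetical; the sketch only illustrates why a logarithm tends to decouple spread from level when variability is roughly proportional to the mean.

```python
import numpy as np


def shifted_power(x, a, b):
    """y = (x + a)**b, with b = 0 taken to mean the logarithm ln(x + a)."""
    x = np.asarray(x, dtype=float)
    return np.log(x + a) if b == 0 else (x + a) ** b


# Invented CPPAP values for two areas: same relative spread, different level.
low = np.array([1.0, 2.0, 4.0])
high = 10.0 * low
spread_raw = (low.std(ddof=1), high.std(ddof=1))
spread_log = (shifted_power(low, 1, 0).std(ddof=1),
              shifted_power(high, 1, 0).std(ddof=1))
```

On the raw scale the second area's standard deviation is ten times the first; after the $\ln(x + 1)$ transformation the two are of similar magnitude.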
Figure 7 plots the standard deviation $(s_y)_i$ vs. $\bar{y}_i$, with the plotting
character showing the size of the area; "1" for areas with assigned pairs
$p_i < 500$; "2" for $500 < p_i < 700$; "3" for $700 < p_i < 950$; "4" for $p_i > 950$.
There appears to be no systematic relation between $(s_y)_i$ and $\bar{y}_i$, although
the two extreme (high and low) values on $\bar{y}$ possibly suggest a decreasing
trend; certainly there is nothing like the behavior in Fig. 4. Moreover,
the higher number plotting characters tend to be at the bottom of the
plot with the lower numbers at the top, implying that larger areas have
smaller variability, apart from their average value. The area average $\bar{y}_i$
is plotted against size $p_i$ in Fig. 8; these quantities appear unrelated, so
knowing a priori the size of an area does not enable one to say much
about its expected average cost penalty.
Figure 9 shows the standard deviation $(s_y)_i$ plotted against size $p_i$.
There is a downward trend, and one expects larger areas to have smaller
[Figure 6: residuals from the fitted curve plotted against allocation area mean.]
Fig. 6 — CPPAP values before conversion.
variability. To see to what extent this trend is accounted for by a
$s_y = a/\sqrt{p}$ relationship, $a$ was obtained by a weighted least squares regression
of $(s_y)_i$ on $1/\sqrt{p_i}$; the fitted curve is the solid line in Fig. 9. A weighted
regression was used because the variances of the individual points $(s_y)_i$
about their expectations $\tau_i = a/\sqrt{p_i}$ depend on the values of $\tau_i$ and $m_i$,
the number of months of before data for that area; assuming normality
of the $y$'s, the variance is $0.5\,\tau_i^2/(m_i - 1)$. (This is derived from the $\chi^2$
distribution associated with $(s_y)^2$.) Thus, weights proportional to the
reciprocal square roots of these variances were used, and the following
three plots are the raw residuals multiplied by these weights.
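One way to read the weighting described above is sketched below in Python. It assumes that each residual is multiplied by a weight proportional to $1/\sqrt{\mathrm{var}}$ and that the sum of squared weighted residuals is minimized; because $\tau_i$ is proportional to $1/\sqrt{p_i}$, the unknown $a$ cancels out of the relative weights. The function name and the data are hypothetical, and this should not be taken as the authors' exact computation.

```python
import numpy as np


def fit_a(s, p, m):
    """Weighted least squares estimate of a in s = a / sqrt(p), plus weighted residuals."""
    s, p, m = (np.asarray(v, dtype=float) for v in (s, p, m))
    u = 1.0 / np.sqrt(p)                  # regressor 1/sqrt(p_i)
    # var(s_i) is taken as 0.5 * tau_i**2 / (m_i - 1) with tau_i = a * u_i,
    # so the weight 1/sqrt(var) is proportional to sqrt(m_i - 1) / u_i.
    w = np.sqrt(m - 1.0) / u
    a_hat = np.sum(w**2 * u * s) / np.sum(w**2 * u**2)
    weighted_resid = w * (s - a_hat * u)
    return a_hat, weighted_resid


# Invented areas whose standard deviations follow s = 12 / sqrt(p) exactly.
pairs = [400, 900, 1600]
months = [10, 20, 33]
a_hat, resid = fit_a([0.6, 0.4, 0.3], pairs, months)
```

Under these weights the estimate reduces to a weighted average of $s_i \sqrt{p_i}$ with weights $m_i - 1$, so areas with more months of before data count for more.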
The residuals $(s_y)_i - (\hat{s}_y)_i$ are plotted against $\bar{y}_i$ in Fig. 10. No strong
relationship is apparent. Perhaps the points with extremely high and
low $\bar{y}$ suggest a downward trend, but if these single points are ignored
no structure at all remains. Figure 11 plots each residual against $m_i$, the
number of months of before data for that area. One would like to see a
horizontal band, which would signify no relationship; indeed, the plot
does not suggest any strong relationship. A normal quantile-quantile
probability plot 9 of the residuals is displayed in Fig. 12. This shows
reasonably good normality of the residuals, although the largest value
is somewhat larger than would be expected and there is some bunching
of the residuals, for which we have no explanation.
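The coordinates of a normal quantile-quantile plot such as Fig. 12 can be produced with the Python standard library alone. The plotting positions $(i - 0.5)/n$ used here are one common convention and are an assumption; the paper does not say which positions were used. The function name and data are hypothetical.

```python
from statistics import NormalDist


def normal_qq_points(values):
    """Pair each sorted value with a standard normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    # Plotting positions (i - 0.5) / n are one common convention (assumed here).
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Invented residuals; each point is (theoretical quantile, observed value).
points = normal_qq_points([0.3, -1.2, 0.0, 1.1, -0.4])
```

If the values are roughly normal, the points fall close to a straight line; a largest value well above the line corresponds to the kind of departure noted in the text.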
[Figure 7: allocation area standard deviation plotted against allocation area mean; plotting characters: 1 = tiny, 2 = small, 3 = medium, 4 = large.]
Fig. 7 — Values of ln(CPPAP + 1) before conversion.
Thus, for the logarithmic transform of the original cost penalty nicer
behavior results than with the raw variable. An area's standard deviation
is unrelated to its level, but it is related to its size in a reasonable way;
moreover, the residuals from this relationship have reasonable proper-
ties. A number of additional properties of these data were explored, but
to conserve space only a few will be discussed in any detail.
For each month, the mean and standard deviation of the CPPAP values
for all allocation areas for that month were calculated. Figure 13 plots
the monthly standard deviations vs. the monthly average, again using
y = ln(CPPAP + 1). There are 33 points in the plot, one for each month;
of course the points from later months are based on successively fewer
values as areas are converted. No relationship is apparent; this is con-
sistent with the lack of relationship between standard deviation and
mean as calculated for each area in Fig. 7. The monthly average vs. the
month number and the smooth of these data [using 4(3RSR)2, twice, a
non-linear smoother 8 ], are shown in Fig. 14. This suggests somewhat of
a cyclic behavior in the average cost penalty. Local peaks appear around
months 1-2, 12-14, and 26-28. One might hypothesize the existence of
a cyclic 12-month structure to these data due to seasonal local factors
such as weather, churn, and inward and outward movement. However,
[Figure 8: area average plotted against assigned pairs.]
Fig. 8 — Values of ln(CPPAP + 1) before conversion.
Fig. 14 does not show such clear behavior that one could extrapolate some
fitted cycle with any confidence. Moreover, recall that the purpose of
these analyses is to develop methods that could be used with (probably
less extensive) data from other districts for decision making. We would
not want to extrapolate a specific seasonal pattern from Fig. 14 to a new
district without careful consideration of similarities and differences
between the new district and the Prototype District. One might, though,
wish to use 12 or 24 months data when arriving at decisions so as to re-
move seasonal effects. The possible seasonal factor is discussed further
in reference to somewhat different purposes in Section V.
Distributional characteristics and the correlation structure of the
transformed observations can also be of interest. Figure 15 gives a normal
quantile-quantile plot of $(y_{ij} - \bar{y}_{i\cdot})\sqrt{p_i}$ for all
areas $i$ and months $j$ before conversion. This quantity is of interest
because some differences between areas are expected, but can be removed
by looking at the deviations $y_{ij} - \bar{y}_{i\cdot}$. No strong monthly
effect was seen above, so that possibility is ignored here; and also it
was found earlier that var($y_{ij}$) is approximately $\sigma^2/p_i$, so
the values $(y_{ij} - \bar{y}_{i\cdot})\sqrt{p_i}$ should have approximately
equal variance. Figure 15 shows that these values are distributed
reasonably closely to the normal distribution.
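The quantity plotted in Fig. 15 can be sketched in a few lines. The following is a minimal illustration, not the authors' program: the area names, the CPPAP values, the function names, and the plotting position (k - 0.5)/n are all assumptions made for the example.

```python
import math
from statistics import NormalDist, mean


def weighted_deviations(y_by_area, pairs_by_area):
    """Return (y_ij - ybar_i) * sqrt(p_i) pooled over all areas and months."""
    pooled = []
    for area, ys in y_by_area.items():
        ybar = mean(ys)                          # area average of transformed values
        scale = math.sqrt(pairs_by_area[area])   # square root of assigned pairs
        pooled.extend((y - ybar) * scale for y in ys)
    return pooled


def normal_qq_points(values):
    """Pair each sorted value with a standard normal quantile.

    The plotting position (k - 0.5) / n is an assumption of this sketch.
    """
    n = len(values)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]
    return list(zip(theoretical, sorted(values)))


# Hypothetical monthly CPPAP values (dollars) for two areas.
cppap = {"A": [12.0, 9.5, 15.2, 11.1], "B": [30.4, 25.0, 41.3]}
pairs = {"A": 900, "B": 400}
y = {area: [math.log(c + 1.0) for c in vals] for area, vals in cppap.items()}

devs = weighted_deviations(y, pairs)
qq = normal_qq_points(devs)   # (theoretical quantile, observed value) pairs
```

Plotting the second element of each pair against the first gives a normal quantile-quantile plot; points lying near a straight line would be consistent with approximate normality.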
[Fig. 9. Values of ln(CPPAP + 1) before conversion. Horizontal axis: assigned pairs ($p_i$).]
Turning to the possible relationships between areas, a different normal
quantile-quantile plot, calculated from correlations in the following way,
is given in Fig. 16. For each pair of areas $k$ and $l$, the correlation
between the above $(y_{ij} - \bar{y}_{i\cdot})\sqrt{p_i}$, $i = k$ and $l$,
was calculated over the before conversion months common for both areas.
This gives 253 (= 23·22/2) estimated correlations, and we would like to
see to what extent these differ from a random sample of correlations
where the true correlation coefficient is 0. Fisher's z transformation,

    $$ z = \tfrac{1}{2} \ln\left(\frac{1 + r}{1 - r}\right), $$

was used to achieve approximate normality. If the population correlation
is 0, then mean($z$) ≈ 0,

    $$ \mathrm{var}(z) \approx \frac{1}{n + 1}, $$

where $n$ is the sample size and $z$ is approximately normally distributed.
For these data each $z$ was divided by the standard deviation
corresponding to the number of months $n$ from which it was calculated, and
Fig. 16 is a normal quantile-quantile plot of the standardized z's. A
[Fig. 10. Values of ln(CPPAP + 1) before conversion. Horizontal axis: allocation area mean.]
"perfect" result would have all points on the y = x line, which is drawn
on the plot. However, even if the true correlation were 0, one would not
necessarily expect our standardized z's to scatter exactly about this line
since we do not have 253 correlation coefficients calculated
independently of one another. Instead they are formed pairwise from 23
variables, implying some (complicated) structure among them. In Fig. 16
the points are uniformly above, but quite close to, the y = x line; the
standardized z's are slightly but consistently larger than would be
expected if all true correlations were 0. The median of the standardized
z's corresponds to a population correlation of about 0.3. Thus there is
evidence of a positive but not large correlation between the values in
different areas at the same point in time. This result is not intuitively
unexpected since geographic proximity is probably the cause. For
example, a heavy rainstorm may increase cable troubles, and hence produce
larger values of CPPAP, in several nearby areas at once. A more exhaustive
exploration of the correlation structure of these data could also
consider correlations both between
and within areas at different points in time, i.e., with leads and lags.
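The standardization of the pairwise correlations described above can be sketched as follows. This is an illustrative example only: the two series are hypothetical, the function names are invented for the sketch, and the null variance is passed in as a parameter so that whichever approximation is preferred can be supplied.

```python
import math


def pearson_r(u, v):
    """Sample correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def fisher_z(r):
    """Fisher's z transformation of a correlation coefficient."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def z_to_r(z):
    """Inverse of Fisher's z (the hyperbolic tangent)."""
    return math.tanh(z)


def standardized_z(r, null_variance):
    """Divide z by its standard deviation under zero true correlation."""
    return fisher_z(r) / math.sqrt(null_variance)


# Hypothetical weighted deviations for two areas over four common months.
area_k = [1.0, 2.0, 3.0, 4.0]
area_l = [2.0, 1.0, 4.0, 3.0]
n = len(area_k)

r = pearson_r(area_k, area_l)
# Variance taken as 1/(n + 1), following the expression as it appears in
# the text; 1/(n - 3) is a widely used textbook alternative.
z_std = standardized_z(r, null_variance=1.0 / (n + 1))
```

Repeating this for every pair of areas and plotting the sorted standardized values against normal quantiles would give a plot of the kind shown in Fig. 16; `z_to_r` converts a value on the z scale back to the correlation scale.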
Another plot of some interest, Fig. 17, shows $\bar{y}_{i\cdot}$ vs. the
distance of each area from the central office, $d_i$. Although one might or
might not expect such a relationship, the data strongly suggest that
areas further from the
[Fig. 11. Values of ln(CPPAP + 1) before conversion. Horizontal axis: number of months of before data for area ($m_i$).]
central office have higher cost penalties. It would be of interest to have
explanations for this and to see if this relationship generalizes to other
districts. Such investigations are in progress by the authors and others.
However, as with the possible monthly cycle seen above, we would not
necessarily want to extrapolate this in a straightforward way to other
districts. It is also of interest to consider the plot of the weighted residual
$((s_y)_i - (\hat{s}_y)_i)$ vs. $d_i$, given in Fig. 18. Although the area
average may be related to $d_i$, Fig. 18 shows that the part of the standard
deviation not predicted from the size of the area does not seem related
to $d_i$. This latter
result fits in with the previous discovery that the standard deviations
of the y's do not appear to be systematically related to anything except
the size of the area.
The entire set of plots and analyses described in this section was
repeated using robust estimates of location and scale instead of the sample
mean and standard deviation. The purpose was to see if a small number
of deviant observations might be either causing, or hiding, the rela-
tionships considered above. However, there was no appreciable differ-
ence in the results. The results using the mean and standard deviation,
rather than the more robust statistics, were presented above because of
the widespread familiarity and use of these statistics.
[Fig. 12. Values of ln(CPPAP + 1) before conversion. Normal probability plot, 23 points. Horizontal axis: theoretical quantiles.]
The analyses were also repeated using other cost measures. As in the
case of CPPAP, for each of these measures some transformation of the
original values was discovered which appeared more useful for inter-
pretation and later analysis than was the raw cost measure.
IV. DATA COLLECTION GUIDELINES
4.1. General results
This section makes use of the results from the previous section to
construct guidelines for the collection period and size of future allocation
areas. These guidelines are in the form of confidence intervals for the
[Fig. 13. Values of ln(CPPAP + 1) before conversion. Horizontal axis: monthly mean.]
"true savings" given estimated savings, size of area, and the number of
months of data collection. In addition, methods are presented for ex-
tending these results to local areas with characteristics different from
those of the Prototype District.
Based upon the data analysis of Section III, it is reasonable to use the
following model and analysis. Let $y_{ij}$ = ln(CPPAP + 1) be the transformed
cost measure for area $i$ and month $j$. Express this as

    $$ y_{ij} = \mu_i + e_{ij} \qquad (1) $$

where $\mu_i$ is the "true transformed CPPAP" for this area and $e_{ij}$ is the
"error" term corresponding to this month. We wish to make inferences
about the area values $\mu_i$ and differences $\mu_i - \mu_k$.
Consider assumptions one can reasonably make concerning the $e_{ij}$.
From theoretical grounds it is reasonable to assume that

    $$ \mathrm{var}(e_{ij}) = \frac{\sigma^2}{p_i} \qquad (2) $$

where $p_i$ is the size, in assigned pairs, of the area. The quantity $\sigma^2$ can
be interpreted as the inherent variability from one assigned pair in one
month, and the error term $e_{ij}$ results from averaging over $p_i$ assigned
[Fig. 14. Values of ln(CPPAP + 1) before conversion. Horizontal axis: month.]
pairs. This assumption was supported by the analysis of Section III.
Moreover, that analysis showed that the standard deviation (of the
transformed CPPAP) does not seem to be related to any other available
variable.
Considering further assumptions concerning the distribution of the
$e_{ij}$, it would be convenient, natural, and relatively simple if we could
assume that the $e_{ij}$ are independently normally (Gaussian) distributed
with mean 0 (and variance from eq. (2)). In support of these assumptions,
it was shown in Section III that $\sqrt{p_i}$ times the estimated $e_{ij}$
(i.e., $(y_{ij} - \bar{y}_{i\cdot})\sqrt{p_i}$) were normally distributed
after transformation. As for the
independence assumption, these values were found in Section III to have
a positive, although not extremely large, correlation between areas.
However, the independence assumption between areas is important
mainly for the confidence interval comparison of two different areas,
as in eq. (5) below, and a positive correlation implies that that interval
would tend to be conservative, i.e., longer than necessary.
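The remark that a positive correlation makes the interval conservative can be made explicit. In the sketch below, $c$ denotes the covariance between the two area averages; this symbol is introduced here for illustration only.

```latex
\operatorname{var}(\bar{y}_{i\cdot} - \bar{y}_{k\cdot})
  = \operatorname{var}(\bar{y}_{i\cdot}) + \operatorname{var}(\bar{y}_{k\cdot}) - 2c
  \;\le\; \operatorname{var}(\bar{y}_{i\cdot}) + \operatorname{var}(\bar{y}_{k\cdot})
  \qquad \text{when } c \ge 0 .
```

An interval built from the sum of the two variances, as under independence, is therefore at least as long as one that took the covariance into account.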
Thus, for purposes of the analysis we assume that the $e_{ij}$ are
independent normal $(0, \sigma^2/p_i)$. Thus the estimate $\hat{\mu}_i$ in eq. (1)
is $\bar{y}_{i\cdot}$; i.e., the
"true transformed cost penalty" is simply estimated by the average of
[Fig. 15. Values of ln(CPPAP + 1) before conversion. Normal probability plot, 570 points. Horizontal axis: theoretical quantiles.]
all observations for that area. Furthermore,

    $$ \mathrm{var}(\bar{y}_{i\cdot}) = \frac{\sigma^2}{p_i \cdot m_i} \qquad (3) $$

where $m_i$ is the number of months of before conversion values available
for area $i$. Confidence intervals for $\mu_i$ (or $\mu_i - \mu_k$) can be calculated using
eq. (3) and standard normal theory. A 100(1 - α) percent confidence
interval for $\mu_i$ is

    $$ \bar{y}_{i\cdot} \pm z \, \frac{\hat{\sigma}}{\sqrt{p_i \cdot m_i}} \qquad (4) $$

[Fig. 16. Values of ln(CPPAP + 1) before conversion. Normal probability plot, 253 points. Horizontal axis: theoretical quantiles.]
where $z$ is the upper 1 - α/2 quantile of the standard normal distribution
and $\hat{\sigma}$ is an estimate of $\sigma$ described below. (Alternatively a $t$ distribution
could be used, but the degrees of freedom used in estimating $\sigma$ should
be large enough so that the difference in quantiles would be small.)
Similarly, a confidence interval for the difference in "true" CPPAPs for
two areas, $\mu_i - \mu_k$, is

    $$ (\bar{y}_{i\cdot} - \bar{y}_{k\cdot}) \pm z \, \hat{\sigma} \left( \frac{1}{p_i \cdot m_i} + \frac{1}{p_k \cdot m_k} \right)^{1/2} \qquad (5) $$

The estimate of $\sigma^2$, $\hat{\sigma}^2$, is obtained from the regression
[Fig. 17. Values of ln(CPPAP + 1) before conversion. Horizontal axis: distance from CO ($d_i$).]

    $$ (s_y)_i = \frac{\sigma}{\sqrt{p_i}} + \epsilon_i \qquad (6) $$

where $(s_y)_i$ is the observed standard deviation of the $m_i$ values in area
$i$, and $\epsilon_i$ is an error term reflecting the departure of the observed $(s_y)_i$
from this model. Eq. (6) is obtained from eq. (2) and its use is supported
by Fig. 9 and other analysis in Section III. The variance of $\epsilon_i$, given in
Section III, depends on $i$, so an iterated weighted regression is performed
to obtain $\hat{\sigma}$. Our value is 12.40. Thus the variance is effectively estimated
by pooling results across all areas, while allowing for the fact that
different sized areas have different variance.
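A weighted fit of the regression through the origin in eq. (6) can be sketched as follows. This is not the iterated procedure used in the paper, whose weights come from Section III; the weights here are an assumption of the sketch, based on the usual large-sample approximation for the variance of a sample standard deviation, and the data are hypothetical. Under this assumed weighting the unknown constant cancels from the weights, so no iteration is needed.

```python
import math


def estimate_sigma(sds, pairs, months):
    """Weighted fit through the origin of s_i on 1 / sqrt(p_i).

    Assumed weights: w_i = p_i * (m_i - 1), proportional to the reciprocal
    of the approximate variance of a sample standard deviation computed
    from m_i normal values whose variance is sigma^2 / p_i.
    """
    num = 0.0
    den = 0.0
    for s, p, m in zip(sds, pairs, months):
        x = 1.0 / math.sqrt(p)   # regressor in eq. (6)
        w = p * (m - 1)          # assumed weight for this area
        num += w * x * s
        den += w * x * x
    return num / den


# Hypothetical areas: assigned pairs, months of data, observed std. devs.
pairs = [250, 600, 1200, 1800]
months = [12, 20, 28, 9]
sds = [0.80, 0.49, 0.37, 0.30]

sigma_hat = estimate_sigma(sds, pairs, months)
```

With these weights the estimate reduces to an average of $(s_y)_i \sqrt{p_i}$ weighted by $m_i - 1$, so areas with more months of data count for more.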
Up to this point all the work in this section has been on variables
measured on the transformed scale, i.e., ln(CPPAP + 1). Recall this
transformation was selected to reduce the dependence of the variability
on the level of CPPAP and to improve normality. Therefore, confidence
intervals are for parameters $\mu_i$, $\mu_k$ which are also transformed. However,
we are interested in having tables (for example) based on the original
data (untransformed) and representing untransformed parameters. This
is simply done by forming the confidence intervals on the transformed
scale and then performing the inverse transformation $x = e^y - 1$.
Shown in Table III are the 95 percent confidence intervals for various
[Fig. 18. Values of ln(CPPAP + 1) before conversion. Horizontal axis: distance to CO ($d_i$).]
observed values of the CPPAP calculated using eq. (4) and $\hat{\sigma}$ estimated
from the data. The time (in months) is the number of months used in
forming the average value while the size is in pairs assigned. For example,
suppose one has an area of 750 assigned pairs and has collected data for
12 months. If the computed average CPPAP is $10, the confidence interval
is from $7.47 to $13.29. If the computed CPPAP is $30, the interval is
$22.87 to $39.26. The interpretation is that 95 percent of the time, an
observed CPPAP will be such that the associated interval covers the
"true" CPPAP. Note that these intervals are not symmetric. On the
transformed scale the assumptions yield a symmetric interval. However,
when transforming back to the original scale, the nonlinearity of the
exponentiation results in asymmetric intervals.
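The calculation behind these intervals can be sketched directly from eq. (4) and the inverse transformation. The function name is hypothetical, and the multiplier z = 2 is an assumption of the sketch: it is the value that reproduces the intervals quoted above, whereas the exact 95 percent normal quantile is about 1.96.

```python
import math

SIGMA_HAT = 12.40   # estimate reported in the text
Z = 2.0             # assumed multiplier; 1.96 is the exact 95% normal quantile


def cppap_interval(cppap, pairs, months, sigma=SIGMA_HAT, z=Z):
    """Confidence limits for the "true" CPPAP on the original dollar scale."""
    centre = math.log(cppap + 1.0)                 # transformed observed value
    half = z * sigma / math.sqrt(pairs * months)   # half-width from eq. (4)
    lower = math.exp(centre - half) - 1.0          # inverse transformation
    upper = math.exp(centre + half) - 1.0
    return lower, upper


lo10, hi10 = cppap_interval(10.0, 750, 12)
lo30, hi30 = cppap_interval(30.0, 750, 12)
print(f"observed $10: ${lo10:.2f} to ${hi10:.2f}")
print(f"observed $30: ${lo30:.2f} to ${hi30:.2f}")
```

Because the half-width is added and subtracted on the logarithmic scale, the limits on the dollar scale are the observed value plus one, multiplied or divided by the same factor, which is why the upper limit lies farther from the observed value than the lower limit does.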
From the discussion of the variability of the average computed CPPAP
it is clear that as the size of the area increases the variability decreases.
Similarly, if the number of months used in computing the average CPPAP
increases the variability of the estimate decreases. (In fact, based on eq.
(4), and evident from Table III, the effects are symmetric.) To aid in
assessing the magnitude of these effects Figs. 19 and 20 are provided.
Figure 19 shows the upper and lower confidence limits for an observed
CPPAP of $20 formed by averaging over 12 months, for various values
[Table III. 95 percent confidence intervals for observed values of CPPAP, by area size (assigned pairs) and collection time (months). Table body illegible in the source scan.]
[Fig. 19. Upper and lower confidence interval. Horizontal axis: size in assigned pairs (100's).]
of the size. Both the asymmetry and the decrease in the size of the con-
fidence interval are evident. Note that for the smaller areas the effect
of the asymmetry is greater. Figure 20 is the same type of plot for an
observed value of $30 of CPPAP for an area with 250 assigned pairs for
differing numbers of months. Note that for this very small area, the
confidence limits are quite wide and the effect of the asymmetry is much
greater than that seen in Fig. 19.
Table III can be used to help decide an appropriate size for allocation
areas and an appropriate length of time for data collection. For a given
size and time, the confidence intervals for various observed values of
CPPAP can be read from Table III. For example, if allocation areas are
created of size 500 assigned pairs or larger, and if data are collected for
12 months or longer, then an observed CPPAP of $20 would give a con-
fidence interval of $14.25 to $27.92 — or a shorter interval if the area is
[Fig. 20. Upper and lower confidence interval. Horizontal axis: time in months.]
larger or the data collection period longer. If the uncertainty in the "true"
CPPAP represented by this interval is acceptable, then allocation areas
could be sized to a minimum of 500 pairs with data collection for a
minimum of 12 months. The uncertainty resulting from alternative
values of size and time can be checked in this way using Table III. When
forming allocation areas in a district and determining the length of time
for data collection, the minimum size and time should be chosen so as
to produce results precise enough for the decision making needs of the
district.
4.2. Extending results to individual areas
The basic results presented in Table III are given for only three values
of the measured CPPAP, six different collection periods, and six area sizes.
The first and most straightforward extension of this analysis to different
areas and collection periods involves extending the tables using eq. (4)
or by linear interpolation of the given table values. As can be seen from
Figs. 19 and 20, any linear interpolation is more valid for the range of the
table associated with longer collection times and larger collection areas.
This is simply because the effect of the transformation is more linear for
this range of values.
In the event that users of CPPAP data feel that their areas are signifi-
cantly different from the Prototype District, which is the basis of Table
III, there are several ways in which this analysis can be modified. First,
the constant associated with eq. (6) can be re-estimated using the tech-
niques described in Section 4.1. While the estimation of the weights in
the regression is somewhat more complicated than ordinary least
squares, most commercially available statistical computation packages
allow for this type of estimation. Having computed the constant which
relates variability to size of area, it is a simple matter to generate tables
analogous to Table III.
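Generating such a table from a re-estimated constant might look like the sketch below. The grid of CPPAP values, sizes, and collection periods is hypothetical and is not the grid of Table III; the multiplier z = 2 is an assumption that matches the intervals quoted in Section 4.1, and the function names are invented for the example.

```python
import math


def interval(cppap, pairs, months, sigma, z=2.0):
    """Confidence limits on the dollar scale from eq. (4); z = 2 is assumed."""
    half = z * sigma / math.sqrt(pairs * months)
    centre = math.log(cppap + 1.0)
    return math.exp(centre - half) - 1.0, math.exp(centre + half) - 1.0


def build_table(cppap_values, sizes, months_list, sigma):
    """One row per (observed CPPAP, size, months) with rounded limits."""
    rows = []
    for c in cppap_values:
        for p in sizes:
            for m in months_list:
                lower, upper = interval(c, p, m, sigma)
                rows.append((c, p, m, round(lower, 2), round(upper, 2)))
    return rows


# Hypothetical grid; sigma would be the locally re-estimated constant.
table = build_table([10.0, 20.0, 30.0], [250, 500, 750, 1000], [6, 12, 24],
                    sigma=12.40)
for c, p, m, lower, upper in table:
    print(f"CPPAP ${c:5.2f}  size {p:5d}  months {m:3d}  "
          f"${lower:6.2f} to ${upper:6.2f}")
```

Since size and months enter only through their product, an area of 500 pairs observed for 12 months gives the same interval as one of 1000 pairs observed for 6 months.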
However, the logarithmic transformation of CPPAP used here for
analysis might not always satisfy the desired assumptions. In this case
a more exploratory analysis should be undertaken. Unfortunately, such
an analysis will require additional statistical computation and display.
The sequence of steps discussed in Section III can serve as a guide for
the analysis, and for checking the appropriateness of various transfor-
mations. Finally, it is possible that no appropriate transformation will
be found. Then the method of analysis employed in this section will not
be adequate.
V. ANALYSIS OF AFTER CONVERSION DATA
5.1. Description of analysis
A major concern in the conversion of serving areas to SAC is whether
or not the projected savings are being realized. To help answer this
question the cost penalty data in the periods after conversion are ex-
amined. A regression equation is developed which models the after
conversion costs in terms of before and during conversion variables as
well as the time since conversion. The most important result shows that
the cost penalty continues to decline for the period immediately following
conversion. The implication of these findings on conversion analysis is
that to adequately assess the effect of conversion, cost data must be
collected for a period of nine to twelve months after conversion.
One might assume, a priori, that there will be differences in the con-
verted areas but that such differences would not be related to the before
or during conversion periods. These areas were all rehabilitated using
the same FAP guidelines, so they should start off on the same footing.
Differences might be related to installer productivity or activity, or
geographic considerations of the areas. However, data on such variables
are outside the scope of the Prototype District Data Base and are not
currently available. It is of interest to know to what extent after con-
version behavior might be explained, and the analyses of this section are
directed at using variables available in the data base to this end.
Since the logarithmic transformation of the before conversion data
satisfied straightforward assumptions needed for analysis (see Section
III), one might expect this transformation also to be reasonable for the
after conversion data unless there are some "structural" changes in the
after conversion period. Our analyses do not indicate any such change,
so the quantity analyzed here is y = ln(CPPAP + 1). Ten of the 23 allo-
cation areas were converted, and each of these areas has from 1 to 20
months of after conversion data. The total number of values (areas •
months) is 100.
We search for a linear description of the 100 $y$'s of the form

    $$ y_{ij} = a_0 + a_1 x_{1ij} + a_2 x_{2ij} + \cdots + a_\ell x_{\ell ij} + e_{ij} \qquad (7) $$

where $i$ denotes area; $j$ denotes month; $x_1$ is some descriptive or
explanatory variable with value $x_{1ij}$ for the $i$th area and $j$th month;
similarly for $x_2, \ldots, x_\ell$; and $e_{ij}$ is the residual which is unexplained, and
which should not be related to any available variable. In accord with the
analysis in Sections III and IV, we assume that var($e_{ij}$) = $\sigma^2/p_i$. Thus,
all regressions discussed here are weighted regressions with weights
inversely proportional to the square roots of these variances. The problem
is to find a good but parsimonious set of variables $x_1, x_2, \ldots, x_\ell$.
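A weighted regression of this kind can be sketched with the normal equations. This is an illustration under stated assumptions rather than the authors' computation: each observation receives weight $p_i$ in the sum of squares, which is equivalent to scaling its row by $\sqrt{p_i}$, the reciprocal of its error standard deviation up to a constant. The data and function names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def weighted_fit(rows, y, pairs):
    """Coefficients [a0, a1, ...] minimizing sum of p * (y - fitted)^2."""
    design = [[1.0] + list(r) for r in rows]         # prepend intercept column
    k = len(design[0])
    xtwx = [[sum(p * d[i] * d[j] for d, p in zip(design, pairs))
             for j in range(k)] for i in range(k)]
    xtwy = [sum(p * d[i] * yy for d, yy, p in zip(design, y, pairs))
            for i in range(k)]
    return solve(xtwx, xtwy)


# Hypothetical observations: months since conversion, transformed cost, size.
months_since = [(1.0,), (4.0,), (8.0,), (12.0,), (16.0,)]
y_obs = [1.45, 1.20, 1.10, 0.85, 0.80]
assigned_pairs = [500, 500, 900, 900, 1200]

a0, a1 = weighted_fit(months_since, y_obs, assigned_pairs)
```

Larger areas have smaller error variance under the model, so their observations pull the fitted line more strongly than those of small areas.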
5.2. Fitted regression equation
Three classes of potential descriptive variables x are considered. First
are variables which give some characteristics of areas, where these
characteristics can be observed before the after conversion period. Such
a variable has a fixed value for each area ($i$) across months ($j$). Examples
include the distance of an area from the central office, the size of an area,
and the average cost penalty for an area before conversion. The second
class of variables concerns seasonal cycles across months. Such a variable
has a value depending on the month ($j$) but is constant across areas
($i$). The third class consists of the single variable giving the number of
months since conversion of that area; thus $x_{ij} = k$, where month $j$ is $k$
months past the conversion date of area $i$.
Consider the first class of variables. The most powerful such variables
would be a set of 10, with each variable having some non-zero value in
one area and the value zero in all other areas. This gives a one-way
analysis of variance model, with the area corresponding to the treatment
or groups. Doing this, one obtains an $R^2$ = 0.28. This means that 28
percent of the variation in the $y$'s can be explained by differences
between the areas.
The fit is improved substantially ($R^2$ = 0.37) by adding to this model
the variable which measures the number of months since conversion.
However, the further addition of variables allowing different values for
different months — the seasonal or time effect variables — improves the
fit only negligibly. Thus, use of all the variables available here would
result in a model describing about 40 percent of the variability in the
after conversion values. Although this is not a large percentage on an
absolute basis, it is also not negligible, especially considering that this
is variability over months and areas after conversion to SAC.
Now we would like to go further and discover specific characteristics
of the ten areas and specific variables that would give a simpler but still
relatively good descriptive model. The following eight variables mea-
suring characteristics of the areas were considered: the size of an area,
as measured by the number of assigned pairs; distance to central office
along feeder cable, measured in kilofeet; area mean before conversion;
area standard deviation before conversion; area mean during conversion;
area standard deviation during conversion; number of months of before
data available; and number of months during conversion. The above
one-way analysis of variance implies that the maximum descriptive
power of any subset, or transformations, of these variables is 28 per-
cent.
In order to find a small but good set of variables and transformations,
extensive regression analyses were done, including stepwise calculations
and $C_p$ analysis.¹⁰ As is often the case in such problems, no small set of
variables clearly stands out as the unique "best" regression equation.
Correlations between explanatory variables can permit several different
sets of variables to fit the data approximately equally well. We will now
discuss one simple model that does fit these data reasonably well.
Variables included in the model are the following: number of months
since conversion; during conversion mean; during conversion standard
deviation; and number of months before conversion. The fitted regres-
sion equation is summarized in Table IV, which gives the regression
coefficients, the estimated standard errors, and the $t$-values for testing
each coefficient equal to zero. The $R^2$ is 0.35 with residual standard error
of 0.44, compared to a standard deviation of 0.54 for the dependent
variable. Thus, use of only four variables gives a fit nearly as tight as can
be obtained when using all possible explanatory variables available here.
Table IV. Fitted equation for after conversion data*

$y_{ij} = 1.60 - 0.044\,x_{1ij} + 0.44\,x_{2i} - 1.13\,x_{3i} - 0.038\,x_{4i}$

Term              Constant   x_1ij    x_2i    x_3i    x_4i
Coefficient         1.60    -0.044    0.44   -1.13   -0.038
Standard error      0.36     0.009    0.14    0.37    0.010
t-statistic         4.41    -4.89     3.19   -3.07   -3.80

* $y_{ij}$ = ln(CPPAP + 1)
  $x_{1ij}$ = number of months since conversion
  $x_{2i}$ = during conversion mean
  $x_{3i}$ = during conversion standard deviation
  $x_{4i}$ = number of months before conversion
  ($x_2$, $x_3$, and $x_4$ are all the same over all months; hence, the time subscript $j$ is omitted.)
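As an illustration of how the fitted equation might be used, the sketch below evaluates it for one area and converts the result back to dollars with the inverse transformation. The input values are hypothetical, the function name is invented, and the during conversion mean and standard deviation are assumed to be on the same scale as was used in the fit.

```python
import math

# Coefficients as given in Table IV.
COEF = {
    "constant": 1.60,
    "months_since": -0.044,
    "during_mean": 0.44,
    "during_sd": -1.13,
    "months_before": -0.038,
}


def predicted_cppap(months_since, during_mean, during_sd, months_before):
    """Fitted ln(CPPAP + 1) converted back to the dollar scale."""
    y = (COEF["constant"]
         + COEF["months_since"] * months_since
         + COEF["during_mean"] * during_mean
         + COEF["during_sd"] * during_sd
         + COEF["months_before"] * months_before)
    return math.exp(y) - 1.0


# Hypothetical area: during mean 2.0, during sd 0.5, 15 months of before data.
month_1 = predicted_cppap(1, 2.0, 0.5, 15)
month_12 = predicted_cppap(12, 2.0, 0.5, 15)
ratio = (month_12 + 1.0) / (month_1 + 1.0)   # equals exp(-0.044 * 11)
```

Because the model is linear in ln(CPPAP + 1), each additional month since conversion multiplies the fitted CPPAP + 1 by the same factor, exp(-0.044), whatever the values of the other variables.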
No monthly time variable or cyclic time effect is included, since the
analysis showed that they had no additional explanatory power.
Examination of various residual plots is important in determining the
adequacy of this fit. Figure 21 gives a partial residual plot¹¹ for the
number of months since cut-over ($x_1$) variable. The variable plotted on
the vertical axis is the residual from the regression fit plus the
contribution from this variable. Thus, one expects the points to scatter about
a straight line with slope equal to the regression coefficient for $x_1$, here
-0.044. This figure does not suggest any serious inadequacy in the fit
as far as this variable is concerned. Partial residual plots and residual
plots for the other variables, normal q-q plots, and various box plots of
the data and residuals were also examined. They did not show anything
particularly noteworthy.
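The construction of a partial residual can be sketched as follows. The data are hypothetical and the slope check uses an unweighted fit for simplicity, which is an assumption of this sketch; the regressions in the paper are weighted.

```python
def partial_residuals(y, fitted, x, coef):
    """Residual from the full fit plus the contribution coef * x."""
    return [(yi - fi) + coef * xi for yi, fi, xi in zip(y, fitted, x)]


def ols_slope(x, v):
    """Unweighted least squares slope of v on x."""
    n = len(x)
    mx, mv = sum(x) / n, sum(v) / n
    sxv = sum((a - mx) * (b - mv) for a, b in zip(x, v))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxv / sxx


COEF_X1 = -0.044                       # coefficient of x1 from Table IV

# Hypothetical months since conversion, fitted values, and observations.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
resid = [0.1, -0.2, 0.2, -0.2, 0.1]    # chosen to be uncorrelated with x1
fitted = [1.3 + COEF_X1 * v for v in x1]
y_obs = [f + r for f, r in zip(fitted, resid)]

pr = partial_residuals(y_obs, fitted, x1, COEF_X1)
slope = ols_slope(x1, pr)              # close to COEF_X1 for these data
```

Plotting `pr` against `x1` gives the partial residual plot; curvature or outlying points in such a plot would suggest that a straight-line term in the variable is inadequate.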
Consider the interpretation of the variables in the fitted equation. For
variable $x_1$, the number of months since cut-over, it is not surprising that
the level declines over time after the conversion is completed, since
unknown cable troubles and defective pairs will be discovered and cor-
rected. Figure 21, introduced above, shows graphically that there is a
steadily decreasing trend as the number of months since cut-over in-
creases. There is not an instantaneous decline to a low, constant level.
[Fig. 21. Values of ln(CPPAP + 1) after conversion. Horizontal axis: number of months after conversion.]
Moreover, this variable ($x_1$) appears with approximately the same
negative coefficient in all "reasonably fitting" sets of variables, while other
individual variables are not so strongly needed in order to obtain an
adequate fit. For variable $x_2$, the during conversion mean, it seems
reasonable that a higher level during the conversion period (a proxy for the
complexity of the conversion activity) will be associated with a larger after
conversion level. However, the interpretations for the during conversion
standard deviation ($x_3$) and the number of months before conversion
($x_4$) are not as straightforward. For example, one could speculate that
areas with a high level of during variability have spots of local congestion
causing occasional high costs (i.e., RE's, LST's, WOL's, etc.). A large
standard deviation implies that there are also months in which costs are
low. It is just this type of area that can show large savings (and lower
values of CPPAP) after conversion via FAP. The number of months before
conversion could be a proxy for the ranking of the converted areas.
Presumably, the worst areas would be converted earlier. Hence, the
better areas are converted later and the post conversion costs of the
better areas are lower (other things being equal).
VI. ACKNOWLEDGMENT
We are indebted to many members of Department 4511 for their time
and effort in explaining concepts and issues pertaining to the loop plant.
Special thanks are due Nancy Basford who never failed to respond
helpfully to our many queries.
REFERENCES
1. G. W. Aughenbaugh and H. T. Stump, "The Facility Analysis Plan: New Methodology
for Improving Loop Plant Operations," B.S.T.J., this issue.
2. J. O. Bergholm, private communication.
3. J. O. Bergholm and P. P. Koliss, "Serving Area Concept — A Plan for Now With a Look
to the Future," Bell Laboratories Record, 50, No. 7 (August 1972), pp. 212-216.
4. N. L. Long, "Loop Plant Modeling: Overview," B.S.T.J., this issue.
5. G. W. Aughenbaugh, N. L. Basford, D. M. Dunn, A. E. Gibson, and J. M. Landwehr,
private communication.
6. A. E. Gibson, personal communication.
7. H. P. Friedman, et al., "A Graphical Way of Describing Changing Multivariate
Patterns," Proc. of the Comp. Science and Statistics 6th Annual Symposium on the
Interface (1972), pp. 56-59.
8. J. W. Tukey, Exploratory Data Analysis, Reading, Mass.: Addison-Wesley Publishing
Co., 1977.
9. M. B. Wilk and R. Gnanadesikan, "Probability Plotting Methods for the Analysis of
Data," Biometrika, 55 (1968), pp. 1-17.
10. C. L. Mallows, "Some Comments on $C_p$," Technometrics, 15 (November 1973), pp.
661-676.
11. W. A. Larsen and S. J. McCleary, "The Use of Partial Residual Plots in Regression
Analysis," Technometrics, 14 (August 1972), pp. 781-790.